(57) Abstract



Differential image data compression systems and techniques are disclosed for use in video telephone systems consisting of 
a transmitting portion (12) and a receiving portion (38). In the transmitting section (12) a reduced grey-scale (luminance) image, 
preferably consisting of only black-and-white pixels, is compared in an image processing module (20) with a similarly reduced 
image derived from previous values to determine pixel positions that have changed. Information representative of the changes is 
then encoded and modulated and (time or frequency) multiplexed with the audio and/or chrominance signals for transmission. 
At the receiver (38), the incoming signal is demodulated and demultiplexed to separate the audio and video portions, the image 
portion is decoded and the luminance value is updated by an image updating unit (50). Adaptive resolution, pixel averaging and 
interpolation techniques are also disclosed for picture enhancement. 
VIDEO TELEPHONE SYSTEMS

Background of the Invention

The technical field of this invention is
image processing and, more specifically, differential
motion detection processes and devices. In
particular, the invention relates to video telephones
for transmitting both sound and images in real time
over standard residential telephone lines.

When video conferencing was first
demonstrated at the New York World's Fair in 1964,
public expectations were raised that a new technology
would soon render the telephone obsolete. However,
various technical constraints have made video
telephone systems prohibitively costly to all but a
relatively small group. In particular, the amount of
image data that must be transmitted has posed a most
significant problem because such data far exceeds the
capacity of existing standard residential telephone
networks.
Researchers have attempted to overcome this
obstacle in two ways: first, by using a different
medium for data transmission to enable a higher data
transfer rate; or second, by using image data
manipulation techniques to compress the amount of
data required to represent the image. This invention
is primarily concerned with the latter approach of
data compression.

Much of the work on video conferencing has
been directed toward data transmission over special,
high-quality transmission lines, such as fiber
optics, which are capable of transmitting at least
several times as much data as standard residential
telephone lines. For example, an Integrated Services
Digital Network (ISDN) service is being implemented
with a 64 kbit/sec video transmission rate to
replace, in some instances, the standard 3 kHz
telephone lines that can handle at best up to about
20 kbit/sec, depending upon the signal processing
employed. These special lines are relatively costly
and currently are available only in limited areas.

An object of this invention is to provide an
image data compression process to enable video
telephones to be used over the present, copper-based,
residential telephone network, as well as other low
bandwidth transmission media.

Another object of this invention is to
provide an inexpensive video telephone that may be
used with standard video cameras and video display
screens or personal computers to provide
videoconferencing capabilities between users
connected to the standard residential telephone
network.
Summary of the Invention

Differential motion detection data
compression systems and techniques are disclosed for
use in low-cost video telephone systems for full
duplex, real time transmission of audio and video
images over nominal 3 kHz residential telephone lines
and other low bandwidth channels.

Each video telephone device consists of a
transmitting portion and a receiving portion. In a
simple "black-and-white" embodiment, the transmitting
section transforms an image (e.g., either from a
standard video camera or an alternative imaging
device such as a charge coupled device or CCD) into a
reduced grey-scale image preferably consisting of
only black-and-white pixels. This image is then
compared with a similarly reduced image derived from
previous image data to determine pixel positions that
have changed. Information representative of the
changes between image frames is then encoded to
further compress the data and then modulated and
multiplexed with the audio signal for transmission
over a standard 3 kHz telephone line.

In another embodiment, color images can be
transmitted by decomposing the color video signal
into its luminance and chrominance components and
then processing the luminance values in accordance
with this invention. As used herein, the term
"grey-scale image" is intended to encompass both
simple "black-and-white" images and the luminance
component of color images. Techniques for encoding
and transmitting the chrominance values of color
images, as well as reconstruction of a color image
from the luminance and chrominance information, will
be described below.
Coherent modulation/demodulation can be used
to enable transmission and reception of the video and
audio signals over a standard residential telephone
line. Coherent modulation produces frequency
transformations of the signals to position the
signal bandwidth in the telephone line channel,
nominally 0 to 3 kHz. The coherent modulation also
is used to enable multiplexing two analog signals
simultaneously onto the telephone line bandwidth, as
described in more detail below. Techniques for
reducing crosstalk between the transmitted audio and
video signals, as well as alternative frequency
division multiplexing techniques for transmittal of
the audio and video signals, are also disclosed below.
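As a rough illustration of how two signals can share one band through
coherent (in-phase and quadrature) modulation, the following minimal
sketch places one value on a cosine carrier and another on a sine
carrier of the same frequency, then separates them again by
correlation. The function names, the sample count and the assumption
that each signal is constant over the interval and that the receiver
is perfectly phase-locked are illustrative only and are not taken from
the specification.

```python
import math


def quadrature_mux(video, audio, cycles, n):
    """Return n samples of video*cos + audio*sin on one shared carrier."""
    return [
        video * math.cos(2 * math.pi * cycles * i / n)
        + audio * math.sin(2 * math.pi * cycles * i / n)
        for i in range(n)
    ]


def quadrature_demux(samples, cycles):
    """Recover (video, audio) by correlating with the same two carriers."""
    n = len(samples)
    video = 2 / n * sum(
        s * math.cos(2 * math.pi * cycles * i / n) for i, s in enumerate(samples)
    )
    audio = 2 / n * sum(
        s * math.sin(2 * math.pi * cycles * i / n) for i, s in enumerate(samples)
    )
    return video, audio


# Hypothetical values: 5 whole carrier cycles spread over 64 samples.
line_signal = quadrature_mux(video=0.75, audio=-0.25, cycles=5, n=64)
video_out, audio_out = quadrature_demux(line_signal, cycles=5)
```

The separation works because the cosine and sine carriers are
orthogonal over a whole number of cycles; any phase error at the
receiver would leak one signal into the other, which is the crosstalk
the text refers to.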

In another aspect of the invention, adaptive
resolution apparatus and methods are disclosed in
which different data compression techniques are used,
depending on the degree of motion in the image over
time. In one illustrated embodiment, three states
(fast motion, intermediate motion and slow motion)
are defined and different data processing steps are
practiced in the transmitter based on the state
determination.

The receiving section reverses the data
compression process of the transmitting section. The
incoming signal is demodulated and demultiplexed to
separate the audio and video portions, the image
portion is decoded, and the reduced grey-scale image
of the previous frame is updated accordingly. Prior
to the display of the updated image, the image can be
transformed from a reduced grey-scale state into a
fuller grey-scale image or a reconstructed luminance
signal by overlapping and averaging blocks of pixel
values.
When chrominance information is also
encoded, various transmission schemes can be
employed. For example, luminance and audio
information can be coherently modulated, as in the
black-and-white case, but over a slightly narrowed
bandwidth (e.g., over a 0 - 2500 Hz band with a first
carrier frequency, f1), and the I and Q color components
can be coherently modulated in a second band (e.g.,
over a 2500 - 3000 Hz band with a second carrier
frequency, f2). Alternatively, a luminance signal L
and chrominance signals, e.g., Xred and Xblue color
signals, can be multiplexed over time. In yet
another approach, the color signals can be sampled
over time and then time domain multiplexed over the
audio channel.

The image data compression techniques of the
present invention can be applied not only to video
telephones and video conferencing systems but to
graphic image storage devices, high definition
television (HDTV), cable television, facsimile
machines and computerized document storage on
magnetic or optical disks. In addition, the images
that can be processed vary from still,
black-and-white characters to fast-moving,
high-resolution color images of intricate objects.
The invention can also be adapted for image
transmission over other narrow band media, such as
radio transmissions through the air. In addition,
the invention can be adapted to transmit graphic
images of text generated by a computer instead of a
video camera. The video telephones of the present
invention also are compatible with conventional
telephones and can receive and/or transmit audio
signals alone whenever communication with a regular
telephone or other audio transceiver is desired.
Likewise, the systems of the present invention can be
used not only with analog signals produced by
conventional TV cameras but also with the signals
produced by imaging devices such as CCDs and the
like. These features, as well as the addition,
subtraction or substitution of other components, will
be obvious to those familiar with the art.

It should also be noted that throughout this
specification, the video telephone system has been
described in terms of transmission via telephone
lines having a nominal bandwidth from about 0 to
about 3 kilohertz. However, telephone bandwidths
actually are slightly offset from this range,
typically operating from about 300 Hz to about 3.4
kHz. Those skilled in the art will appreciate this
distinction and can readily adjust the parameters
described herein to match actual conditions in use.
Brief Description of the Drawings

FIG. 1 is a schematic block diagram of a
black-and-white video telephone system in accordance
with the present invention;

FIG. 2 is a schematic block diagram of a
color video telephone system in accordance with the
present invention;

FIG. 3 is a more detailed schematic diagram
of an image processing module for use in the
transmitter of FIG. 1;

FIG. 4 is an illustrative matrix of dithered
threshold values for a 4 x 4 block of pixels useful
in a grey-scale reduction unit according to the
invention;

FIG. 5A illustrates a hysteresis process for
adjusting the dithered threshold values in a
grey-scale reduction unit to decrease toggling and
image flickering for a white pixel value in a
previous frame;

FIG. 5B illustrates a similar hysteresis
adjustment for a black pixel value in a previous
frame;

FIG. 6 is a more detailed schematic diagram
of the modulation and multiplexing elements of the
transmitter and the demodulation and demultiplexing
elements of the receiver of the system of FIG. 1;

FIG. 7 is a schematic illustration of a
system for suppressing cross-talk between the video
and audio signals in a system such as shown in
FIG. 6;

FIG. 8 is a schematic illustration of an
alternative modulation and demodulation approach for
use in the present invention;

FIGS. 9A-9D illustrate an averaging process
useful in the image averaging unit of the receiver of
FIG. 1;

FIG. 10 is a schematic block diagram of a
video telephone system employing an adaptive
resolution module;

FIG. 11A is an illustration of a matrix of
dithered threshold values for coarse resolution in
the adaptive system of FIG. 10;

FIG. 11B is an illustration of a matrix of
dithered threshold values for intermediate resolution
in the adaptive system of FIG. 10; and

FIG. 11C is an illustration of a matrix of
dithered threshold values for fine resolution in the
adaptive system of FIG. 10.
Detailed Description

In FIG. 1, a video telephone system 10 in
accordance with the present invention is shown,
including a transmitter section 12, having a sampling
unit 14, an image processing module 20 (including a
grey-scale reduction unit 16, a frame memory 15 and a
motion detection unit 18), a differential image
encoding unit 22, an optional error correction unit
24, an image modulator 26, an optional audio
modulator 28 and a multiplexing mixer 30. System 10
further includes a receiver section 38 having a
demultiplexing and demodulating unit 42, an optional
speaker 44 for audio output, an optional error
detector 46, an image decoder 48, an image updating
unit 50 (including image memory 52 and comparator
54), a pixel averaging unit 56, and a display or
monitor driver 58 for video output.

Image data is first compressed by the
sampling and grey-scale reduction units 14 and 16,
then compared with a previously reduced grey-scale
image in the motion detection unit 18 to produce
image data representative of the changes from the
previous image frame. In the final stage, the
differential image data is further compressed by the
encoding unit 22, such that the image data may be
modulated and transmitted over a 3 kHz or other
narrow bandwidth channel.
The transmitted image data is received by
the receiving section 38, as shown in FIG. 1, which
reconstructs the new image. The receiving section 38
also has three levels of data decompression. After
demodulation, the encoded image data is decoded by
decoding unit 48 to provide the differential image,
which is then used in the image updating unit 50 to
make the designated changes to the previous image
frame. Finally, the image is averaged in averaging
unit 56 to yield a greater range of shades of grey,
which can then be displayed on monitor 58.

In FIG. 2, an alternative system 10A is
shown, including transmitter 12A and receiver 38A for
incorporating color information, with like reference
characters indicating elements substantially similar
in function to those shown in FIG. 1. System 10A
includes filter 11 which decomposes a color (e.g.,
NTSC) video signal into its luminance component L and
two chrominance components I and Q. The luminance
component L can be processed by module 20, encoder 24
and modulator 26 in a manner substantially identical
to the processing of a black-and-white image, as
shown in FIG. 1. The luminance data can then be
multiplexed with audio data via mixer 30 and
transmitted over a portion of the bandwidth (e.g., a
nominal frequency band ranging from about 0 to about
2500 Hz), while a second portion of the bandwidth
(e.g., a nominal frequency band from about 2500 to
about 3000 Hz) carries the chrominance data. The I and
Q chrominance values can be modulated
by modulator 13 and mixed together in mixer 17. The
luminance/audio and chrominance data can then be
multiplexed together via mixer 19 for transmission.
In the receiver 38A of FIG. 2, the luminance
and chrominance values (as well as audio signals, if
any) are demultiplexed and demodulated by unit 42,
and the luminance data is decoded by decoding unit 48
to provide the differential image, which is then used
in the updating unit 50 to update the image. Again,
in a manner analogous to the process of FIG. 1, the
luminance values can be averaged to yield a greater
range of grey values, which are then input into
display driver 58, together with the chrominance
values, to provide a color video output.

With reference again to FIG. 1, the image
processing module 20 will be described in more
detail. In the first level of data compression, the
grey-scale reducing unit 16 transforms the image by
reducing the number of grey levels for each pixel.
The resultant image, when viewed by the human eye
from a distance, has an appearance which is
strikingly similar to the original image; however, it
requires fewer bits of information. In one preferred
embodiment, the transformation entails reducing an
image having 256 shades of grey into two shades,
white and black. This results in an 8-fold reduction
in the data required to represent the image, as each
pixel is converted from having an 8-bit grey-scale
representation to a 1-bit representation.
To allow the compressed image data to appear
as various shades of grey to the human eye, a
dithering comparison is employed. In one preferred
embodiment, the grey value of each pixel is compared
to a threshold value which varies with its pixel
position. For grey values greater than the
threshold, i.e., lighter in shade, the pixel value
becomes 1, representative of pure white. For grey
values less than the threshold, the transformed pixel
value becomes 0, or pure black.

Different pixel positions have different
threshold values which are selected to provide a
proportional combination of black-and-white pixels
such that when an area or block of pixels is viewed
from a distance, the image appears the desired shade
of grey. The selected threshold values produce
primarily white pixels for light shades of grey, and
increasingly more black pixels per unit area for
darker shades of grey. Various dithering methods
known in the art can be used in the present
invention. See, for example, Ochi et al., "A New
Halftone Reproduction and Transmission Method Using
Standard Black & White Facsimile Code," Vol. COM-35,
IEEE Transactions on Communications, pp. 466-470
(1987), herein incorporated by reference for further
background materials on dithering methods.
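The position-dependent threshold comparison can be sketched as
follows. This is only an illustration under stated assumptions: it
uses the widely known 4 x 4 Bayer index matrix scaled to the 0-255
range, which is not the threshold table of the invention, and it
treats a grey value exactly equal to the threshold as black because
the text does not specify that case. All function names are
hypothetical.

```python
# Standard 4 x 4 Bayer index matrix (an assumption; not the table of FIG. 4).
BAYER_4X4 = [
    [0, 8, 2, 10],
    [12, 4, 14, 6],
    [3, 11, 1, 9],
    [15, 7, 13, 5],
]


def threshold_at(row, col):
    """Position-dependent threshold on the 0-255 grey scale."""
    return BAYER_4X4[row % 4][col % 4] * 16 + 8


def dither(image):
    """Map 8-bit grey values to 1 (white) or 0 (black)."""
    return [
        [1 if grey > threshold_at(r, c) else 0 for c, grey in enumerate(row)]
        for r, row in enumerate(image)
    ]


def white_fraction(grey, size=8):
    """Fraction of white pixels produced by a flat patch of one grey level."""
    flat = [[grey] * size for _ in range(size)]
    halftone = dither(flat)
    return sum(map(sum, halftone)) / (size * size)
```

With these assumed thresholds, a flat mid-grey patch comes out half
white and half black, and lighter greys yield proportionally more
white pixels, which is the effect the paragraph above describes.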
In the second level of data compression, the
compressed or dithered image is then compared in a
motion detection unit 18 to the compressed image from
the previous image frame, as stored in an image
memory unit. The motion detection unit 18 detects
which pixels have been changed between the two image
frames and records the pixel locations. In one
illustrated embodiment, pixel positions with no
change have a value of 0, and those that have
changed, either from white to black or black to
white, have a value of 1. The new compressed image
is then stored in the image memory unit for
comparison with the next image frame.
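The comparison just described amounts to a pixel-wise exclusive OR of
two 1-bit frames. A minimal sketch, with hypothetical names and tiny
made-up frames standing in for real halftone images:

```python
def detect_changes(current, previous):
    """Return a differential frame: 1 where a pixel toggled, else 0."""
    return [
        [cur ^ prev for cur, prev in zip(cur_row, prev_row)]
        for cur_row, prev_row in zip(current, previous)
    ]


# Made-up 2 x 4 halftone frames (0 = black, 1 = white).
previous_frame = [[0, 0, 1, 1], [1, 1, 1, 1]]
current_frame = [[0, 1, 1, 1], [1, 1, 0, 1]]

differential = detect_changes(current_frame, previous_frame)
frame_memory = current_frame  # kept for comparison with the next frame
```

Only two of the eight positions differ here, so the differential frame
is mostly zeros, which is what makes the later encoding stage
effective.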

In the third level of data compression, a
differential image is next encoded in the
differential image encoding unit 22 to further
compress the data prior to transmission. In a
preferred embodiment, run-length encoding is used,
many versions of which are known in the art. In
normal operation, the image will not change too much
from frame to frame, leaving long series of 0 bits in
the differential image. Run-length encoding
represents various lengths (or runs) of consecutive
0's or 1's as code words. For example, in one
embodiment, the shortest code words can be assigned
to the most likely runs, effectively compressing the
number of bits required to represent the data where
there are long series of consecutive 0's or 1's.
See, for example, Gharavi, "Conditional Run-length
and Variable Length Coding of Digital Pictures," IEEE
Transactions on Communications, pp. 671-677 (1987),
incorporated herein by reference for further
explanation of coding schemes.
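The run extraction step can be sketched as below. This shows only how
a row of differential bits collapses into runs; the assignment of
variable-length code words to those runs is not modeled, and the
function names and sample row are hypothetical.

```python
from itertools import groupby


def run_lengths(bits):
    """Collapse a bit sequence into (bit, run_length) pairs."""
    return [(bit, len(list(group))) for bit, group in groupby(bits)]


def expand(runs):
    """Inverse of run_lengths: rebuild the original bit sequence."""
    bits = []
    for bit, length in runs:
        bits.extend([bit] * length)
    return bits


# A made-up 32-pixel row of a differential image with one small change.
row = [0] * 20 + [1] * 3 + [0] * 9
runs = run_lengths(row)
```

Here 32 bits reduce to three runs. A practical coder would then map
each run to a code word, giving the shortest words to the most likely
runs as the text explains.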

The encoded differential image data is then
ready for transmission over a narrow bandwidth
channel, particularly a 3 kHz telephone line.
However, additional optional coding techniques, such
as forward error correction, may be applied prior to
transmission, as will be described below.

The receiving section 38, shown in FIG. 1
(or the similar receiver 38A shown in FIG. 2),
generally reverses the three levels of
data compression described for the transmitter, but
in the opposite order. In the first level of data
decompression, the encoded differential image data is
decoded in decoding unit 48 using the reverse of the
process used in the differential image encoding unit 22.
In the preferred embodiment, this decoding process
would reverse the selected run-length coding scheme.

In the second level of data decompression,
the previous image frame as stored in the image
memory unit 52 is updated by the differential image data
in the image updating unit 50. For pixel positions in
which a change occurred, represented by 1 in the
preferred embodiment, the pixel value of the
corresponding pixel position is changed. This would
switch a black pixel to white and vice versa. For
pixel positions in which the differential image value
is 0, the pixel value of the corresponding pixel
position remains unchanged.
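The update rule above is again an exclusive OR, this time applied at
the receiver. A minimal sketch with hypothetical names and made-up
frames:

```python
def update_frame(stored, differential):
    """Flip every stored pixel whose differential value is 1."""
    return [
        [pixel ^ change for pixel, change in zip(stored_row, diff_row)]
        for stored_row, diff_row in zip(stored, differential)
    ]


# Made-up receiver state and decoded differential image.
stored_frame = [[0, 0, 1, 1], [1, 1, 1, 1]]
decoded_diff = [[0, 1, 0, 0], [0, 0, 1, 0]]

updated_frame = update_frame(stored_frame, decoded_diff)
```

Because the same operation is used on both ends, applying the
differential twice restores the original frame; a transmission error
would therefore persist in the stored image until that pixel changes
again, which is one reason error protection may be useful.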

In the third level of data decompression,
the updated compressed or dithered image is partially
restored to its original image with multiple
grey-scales in the image averaging unit. The value
of each pixel position is calculated by averaging the
pixel values of all positions within a prescribed
region.
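One possible form of this averaging is sketched below. The square
window, its radius, the clipping at the image border and the scaling
to 0-255 are all assumptions made for illustration; the text only
states that values within a prescribed region are averaged.

```python
def average_image(halftone, radius=1):
    """Replace each 1-bit pixel by the mean of its neighbourhood, scaled to 0-255."""
    height, width = len(halftone), len(halftone[0])
    result = []
    for r in range(height):
        out_row = []
        for c in range(width):
            # Window is clipped at the borders (an assumed edge policy).
            rows = range(max(0, r - radius), min(height, r + radius + 1))
            cols = range(max(0, c - radius), min(width, c + radius + 1))
            window = [halftone[y][x] for y in rows for x in cols]
            out_row.append(round(255 * sum(window) / len(window)))
        result.append(out_row)
    return result


# A made-up 4 x 4 checkerboard halftone, which should read as mid-grey.
checker = [[(r + c) % 2 for c in range(4)] for r in range(4)]
grey = average_image(checker)
```

A larger window yields more distinct grey levels per pixel at the cost
of spatial sharpness.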

In addition to the three levels of data
compression and decompression, the invention may
include an error code generator 24 and error detector
46, as shown in dotted lines in FIG. 1. This
adaptation may be desirable for use over noisy
transmission lines. One commonly used error
correction technique is forward error correction
(FEC), in which redundant bits of data are added to
the data stream in a specified manner, such that the
FEC decoder in the receiving section can check for
errors due to noise. See, for example, S. Lin and D.
Costello, Error Control Coding: Fundamentals and
Applications (Prentice-Hall, Englewood Cliffs, NJ
1983) for a further description of FEC systems,
incorporated herein by reference.
While the FEC method is a preferred
technique, other error correction techniques can also
be employed. For example, a joint modulation-coding
scheme may be used to combine units 24 and 26 into a
single unit. At the receiver 38 a corresponding
demodulation-decoding unit combining units 42 and 46
would be used. Possible choices for this technique are
tamed frequency modulation, continuous phase
modulation, and trellis-coded modulation. Other
choices are obvious to those familiar with the art.
These techniques provide noise reduction without
increasing the signal bandwidth, but require more
complexity.

Another optional element of the invention as
shown in FIGS. 1 and 2 is the audio modulator 28 and
mixer 30 for multiplexing an audio signal with the
modulated image signal for simultaneous audio and
video transmission. When an audio modulator and
mixer are used, the receiving section 38 then
separates the audio and video portions of the signal
by an analogous demodulation and demultiplexing unit
42. While this combination is envisioned to be a
highly desirable feature, particularly for video
conferencing, it is not an essential element. For
some applications, an audio portion may be
superfluous or undesired.

FIG. 3 shows the image processing module 20
of FIGS. 1 and 2 (the grey-scale reducing unit 16,
the frame memory unit 15 and the motion detection
unit 18) in greater detail. In particular, FIG. 3
illustrates an embodiment which includes a
hysteresis-dithered thresholding process for
converting a multiple grey-scale image into a
halftone image consisting of black-and-white pixels.
The halftone image in turn is compared to that from
the previous image frame to provide a differential
image, which is further processed by the image
encoding unit in the third compression stage.
As shown in FIG. 3, the image processing
module 20 includes an analog comparator 80, an
ordered dither threshold table 82, a frame memory 15,
inverter 84, summer 86, summer 88, digital-to-analog
converter 90 and an exclusive OR gate 92.

Inputs to the image processing module 20 of
FIG. 3 are luminance values which can be derived from
any standard video camera that converts an image into
an NTSC or similar format (such as PAL or SECAM).
The analog signal representative of the image from
the camera is then passed through a clamp and sample
circuit which provides the reduced analog image,
an analog signal representative of an image
screen of 128 x 128 pixels at a rate of 10 frames per
second. This can be accomplished by sampling the
NTSC signal 128 times per line and one time out of
every four lines. The sampled pixel values from the
analog signal are real numbers representative of an
8-bit grey-scale consisting of 256 shades of grey
from pure black (0) to pure white (255).
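A short calculation shows why sampling and 1-bit reduction alone are
not enough for a telephone line. The frame size and frame rate come
from the paragraph above, and the line capacity is the "at best up to
about 20 kbit/sec" figure given in the background; the variable names
are illustrative only.

```python
PIXELS_PER_LINE = 128
LINES_PER_FRAME = 128
FRAMES_PER_SECOND = 10
LINE_CAPACITY_BPS = 20_000  # upper figure quoted in the background section

pixels_per_second = PIXELS_PER_LINE * LINES_PER_FRAME * FRAMES_PER_SECOND

raw_8bit_bps = pixels_per_second * 8  # sampled image at 256 grey levels
raw_1bit_bps = pixels_per_second * 1  # after reduction to black and white

# Further compression factor still required after the 1-bit reduction.
needed_ratio = raw_1bit_bps / LINE_CAPACITY_BPS
```

Even at one bit per pixel the stream is roughly eight times the quoted
line capacity, before any allowance for audio, so the differential and
run-length stages must supply the remaining reduction.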

While this is the preferred embodiment, it
should be understood that the image size can be any
N x K matrix of pixels, the frame rate may be varied,
and any number of grey levels may be used. Such
alternatives will be obvious to those familiar with
the art; however, if the resolution is increased, the
frame rate will generally have to be decreased to
permit the image data to compress sufficiently to
permit effective transmission of the real time image
over a 3 kHz telephone line. Similarly, if the frame
rate is increased, the image resolution will have to
be decreased accordingly.
In an alternate embodiment, the camera,
clamp and sample circuit may be replaced by a digital
or analog storage device or other means for
generating a digital or analog signal representative
of an image.

The next stage of the process entails
converting the sampled analog pixel values into 1-bit
binary values representative of black or white. This
is accomplished in comparator 80 by comparing the
real value of each pixel with a threshold value, as
stored in the ordered dither threshold table 82, as
shown in FIG. 3. The table is a digital memory
device storing a threshold value, representative of a
shade of grey, for each of
the 128 x 128 pixel positions. Different pixel
positions have different threshold levels to permit a
grey area spanning a given group of neighboring pixel
positions to be represented by combinations of
black-and-white pixels to give the perception of the
particular shade of grey when viewed from a
distance. For example, for an 8-bit grey-scale
spanning from pure black (0) to pure white (255), a
medium dark grey of shade level 63 over a block of
pixels would be converted into black-and-white pixels
with about three times as many black pixels as white.

The output of the analog comparator 80 is
stored in frame memory 15 and also used to "dither"
the threshold values used to process the next frame.
As shown in FIG. 3, a hysteresis-ordered dither
threshold is implemented by inverter 84 and summers
86 and 88, which operate to define a hysteresis band
around each threshold value, Txy ± δ, which serves to
reduce flicker in the output of the analog comparator 80.
A set of illustrative threshold values for
the ordered dither threshold table is shown in
FIG. 4. The 128 x 128 pixel image is broken down
into 4 x 4 pixel blocks. There are 32 x 32
superblocks of these 4 x 4 blocks. The threshold
values are selected to create a line-type dither
pattern, which facilitates greater data compression
in the preferred embodiment of the differential image
encoding stage. As will be described below, a
1-dimensional, Modified Huffman run-length encoding
scheme compresses data effectively where there is a
long series of the same value.

For the example shown in FIG. 4, grey-scale
63 would result in alternating black-and-white pixels
on the first row, all black pixels on the second row,
alternating black-and-white pixels on the third row,
and all black pixels on the fourth row. For a given
4 x 4 block, this results in 12 black pixels and four
white pixels, exactly the desired 3-to-1 ratio. In
addition, the efficiency of the run-length encoding
will be maximized by this dithered pattern as every
other row consists of continuous black pixels.
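The row pattern described for grey level 63 can be reproduced with a
small sketch. The threshold values below are hypothetical: they are
not the values of FIG. 4, which are not reproduced in this text, and
were chosen only so that level 63 gives the stated rows and so that
the second position of the second row is 64, as mentioned later in the
description.

```python
# Hypothetical line-type thresholds for one 4 x 4 block (NOT the FIG. 4 values).
LINE_DITHER = [
    [96, 32, 96, 32],
    [128, 64, 128, 64],
    [112, 48, 112, 48],
    [192, 240, 192, 240],
]


def dither_block(grey):
    """Threshold one flat 4 x 4 block: 1 = white, 0 = black."""
    return [[1 if grey > t else 0 for t in row] for row in LINE_DITHER]


block = dither_block(63)
black = sum(row.count(0) for row in block)
white = sum(row.count(1) for row in block)
```

With these assumed thresholds the second and fourth rows are entirely
black, so a run-length coder sees long uninterrupted runs there, while
the overall block still shows three black pixels for every white one.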

As noted above, to prevent unnecessary
flickering of certain pixels from black to white
between image frames, the preferred embodiment
includes a hysteresis adjustment of the dithered
threshold values. As shown in FIGS. 5A and 5B, the
hysteresis adjustment increases the threshold value
for a given pixel position if the corresponding pixel
position in the previous image frame is black, and
decreases the threshold value if the corresponding
pixel position in the previous frame is white.
To illustrate this process, we return to the
example of grey level 63, as applied to the ordered
dither threshold table of FIG. 4. Note that the
second position in the second row of the matrix has a
threshold value of 64, which is very close to level
63. Minor fluctuations in the grey level that may
occur between sampled image frames could result in
the grey-scale oscillating between 63 and 65, for
example, every tenth of a second, which would result
in the pixel toggling between black and white
every image frame. This would result in an
unnecessary increase in the amount of data that would
be transmitted to the receiving section. To prevent
such unwanted toggling, each dither threshold is
adjusted by a predetermined amount to ensure any
change in shade is sufficient to warrant toggling of
the pixel value.
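The effect of the adjustment can be seen in a small simulation of one
pixel position. The threshold of 64 and the 63/65 oscillation come
from the example above; the adjustment amount of 8 grey levels and the
function names are assumptions, since the text says only that the
amount is predetermined.

```python
def halftone_pixel(grey, threshold, previous, delta=0):
    """Threshold one pixel; previous is 0 (black) or 1 (white).

    The threshold is raised by delta if the pixel was black in the
    previous frame and lowered by delta if it was white.
    """
    adjusted = threshold + delta if previous == 0 else threshold - delta
    return 1 if grey > adjusted else 0


def count_toggles(greys, threshold, delta, start=0):
    """Count how often the 1-bit output changes over a sequence of frames."""
    previous, toggles = start, 0
    for grey in greys:
        pixel = halftone_pixel(grey, threshold, previous, delta)
        toggles += pixel != previous
        previous = pixel
    return toggles


# Ten frames of a grey level flickering around the threshold of 64.
flicker = [63, 65] * 5
toggles_plain = count_toggles(flicker, threshold=64, delta=0)
toggles_hysteresis = count_toggles(flicker, threshold=64, delta=8)
```

Without the adjustment the pixel flips on almost every frame; with it,
the pixel stays put until the grey level moves outside the band, so a
genuine change such as a jump to level 200 still toggles it.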

Once the dither threshold values are
adjusted with respect to the previous image frame,
the new image is compared with the adjusted values to
transform the multiple grey-scale image into a
halftone image. As shown in FIG. 3, the adjusted
dithered threshold value is converted from digital to
analog in a D-to-A converter 90 and sent along with
the reduced analog image to the analog comparator
80. Each reduced analog pixel value that is greater
than the adjusted analog threshold value becomes a
digital output of 1 or white; each reduced analog
pixel value that is less than the adjusted analog
threshold value becomes a digital output of 0 or
black. The analog comparator 80 also converts the
compared results into digital values for each pixel.
The digital output from the analog
comparator 80, i.e., the first level compressed
image, is simultaneously sent to the frame memory 15
and to the motion detection unit 18, which is shown
in FIG. 3 as being an XOR gate 92 for 1-bit adding of
the halftone pixel values generated by the
analog comparator 80 and the halftone pixel values of
the previous halftone image frame as stored in the
frame memory. For pixel values that did not change
between frames, the output of the XOR gate is 0. For
pixel values that changed between frames, the output
of the XOR gate is 1. These values are then sent to
the differential image encoding unit.

As the digital output of the analog
comparator 80 is sent to the XOR gate 92, this data
is also sent to the frame memory 15 to replace the
currently stored pixel values. The new digital
values representative of the halftone image replace
the old values in the frame memory 15 representative
of the previous halftone image frame. This updates
the frame memory 15 for processing of the next frame
in time.

The final data compression stage is the
differential image encoding scheme, as shown in
FIG. 1. The preferred embodiment is a 1-dimensional
Modified Huffman run-length code. This encoding
scheme transforms long series of 0 or 1 bits into
shorter codes. Integrated circuits for
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implementation of such encoding techniques are 
commercially available, e.g., the AM7971
compression-expansion processor chip (Advanced Micro
Devices, Inc., Sunnyvale, California). Alternative
embodiments may be substituted for the 1-dimensional
Modified Huffman code, such as the 2-dimensional
Modified Huffman code, and other variations.
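
The run-length idea behind this stage can be illustrated with a minimal sketch. It covers only the first half of a Modified Huffman coder, namely collapsing a bit sequence into runs; the assignment of variable-length code words to each run length is omitted, and the function names and sample row are hypothetical.

```python
from itertools import groupby


def run_lengths(bits):
    """Collapse a bit sequence into (bit value, run length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(bits)]


def expand(runs):
    """Inverse of run_lengths: rebuild the original bit sequence."""
    bits = []
    for value, length in runs:
        bits.extend([value] * length)
    return bits


# Hypothetical differential line: mostly unchanged pixels (0).
diff_row = [0] * 12 + [1] * 3 + [0] * 9
runs = run_lengths(diff_row)
```

Here 24 bits reduce to three runs; a full coder would then replace each run length with a short code word.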

The encoded differential image values may
either be sent directly over a transmission line or
multiplexed and modulated with the audio portion for
simultaneous transmission over the same bandwidth.
FIG. 6 illustrates this process.

The video signal from the encoder is
processed by image modulating module 26, comprising a
delay modulator 76 and mixer 78. The incoming video
signal, essentially a binary bit stream, is converted
into a rectangular waveform of two levels according
to the following rules: a transition from one level
to the other level is placed at the midpoint of the
bit cell when the binary data contains a one. No
transition is used for a zero unless it is followed
by another zero, in which case the transition is
placed at the end of the bit cell for the first
zero. The resulting waveform is then low pass
filtered to remove higher harmonics and yields an
analog signal which lies within the 0-3 kHz range.
For further details on delay modulation, see Hecht
et al., "Delay Modulation", Vol. 57, Proc. IEEE
(Letters), pp. 1314-1316 (July, 1969). The processed
video portion is then modulated by a cosine function
to permit coherent (in-phase and quadrature)
modulation and added to the audio portion, which is
similarly modulated with a sine function.
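
The delay modulation rules stated above can be expressed as a short sketch. It models each bit cell as two half-cell samples of a two-level waveform; the low pass filtering and carrier mixing are not modelled, and the function name and starting level are assumptions made for illustration.

```python
def delay_modulate(bits, start_level=0):
    """Encode bits as two half-cell levels per bit using delay modulation.

    A one places a transition at the midpoint of its bit cell. A zero has
    no transition unless the next bit is also a zero, in which case the
    transition is placed at the end of the first zero's bit cell.
    """
    level = start_level
    halves = []
    for index, bit in enumerate(bits):
        if bit == 1:
            halves.append(level)
            level ^= 1  # transition at the midpoint of the cell
            halves.append(level)
        else:
            halves.extend([level, level])
            if index + 1 < len(bits) and bits[index + 1] == 0:
                level ^= 1  # transition at the end of this zero's cell
    return halves


waveform = delay_modulate([1, 0, 0, 1])
```

Because transitions are never closer than one bit cell apart, the waveform's energy stays concentrated at low frequencies, which is what allows it to fit in a voice-grade channel after filtering.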
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On the receiving end, a frequency and phase 
recovery unit 40 detects and tracks the phase at
which the signal arrives, and demodulators 43, 45
separate the sine and cosine components of the
signal, providing an audio and a video signal. The
video signal is then further processed by a delay
demodulator 79 to recover the original binary bit
stream.

After demodulation, the process of decoding
the image in the receiver is essentially the reverse
of that described above, with the exception of
grey-scale recovery. Instead, a pseudo grey-scale is
achieved by averaging individual pixel values with
their neighbors.



Alternatively, demultiplexing and
demodulation unit 42 can also include filter elements
83 and 85 and a video/audio recovery module 87 (shown
in dotted lines in FIG. 6) to suppress cross talk.
The output of low pass filter 85 is an audio signal
$r_1$ and the output of high pass filter 83 is a video
signal $r_2$.

The signals $r_1$ and $r_2$ contain the
transmitted audio and video signals plus additional
cross-talk terms. They have the form

$$r_1 = s_{\mathrm{audio}} + T_1[s_{\mathrm{video}}]$$

$$r_2 = s_{\mathrm{video}} + T_2[s_{\mathrm{audio}}]$$

The functions $T_1[s_{\mathrm{video}}]$ and $T_2[s_{\mathrm{audio}}]$ are the
cross-talk terms. If they were absent, $r_1$ would be
the audio signal and $r_2$ would be the video signal.
$T_1[\,]$ and $T_2[\,]$ are transformations defined by the
processing steps carried out between the transmitter
and the receiver. Thus, the transform functions
encompass the filtering as well as the multiplication
operations (e.g., multiplication by the carrier and
its quadrature) that occur during modulation and
multiplexing.

Given $r_1$ and $r_2$, the above equations can be
solved for the audio and video signals. A practical
method is to use recursion. Rewriting the equations
in recursive form yields:

$$s_{\mathrm{audio}} \approx r_1 - T_1[s_{\mathrm{video}}]$$

$$s_{\mathrm{video}} \approx r_2 - T_2[s_{\mathrm{audio}}]$$

where $\approx$ indicates an approximation.
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A three-step recursive process can be used 
to recover the audio and video signals, as shown
in FIG. 7. In the first step, an initial estimate of
$s_{\mathrm{audio}}(t)$ is produced by applying the transform
function $T_1$ to an initialization value of $s_{\mathrm{video}}(t)$
in element 93. This initial value $s^0$ can be obtained
from a previous image frame (with an appropriate
delay) or, during start-up, from an initialization
signal. The transform $T_1[s^0_{\mathrm{video}}(t)]$ is then
subtracted from $r_1$ in summer 94 to obtain the initial
estimate of $s_{\mathrm{audio}}(t)$.

In the second step, an estimate of
$s_{\mathrm{video}}(t+\Delta)$ is produced by applying the transform
function $T_2$ to the estimate of $s_{\mathrm{audio}}(t)$ (obtained in
step one) in element 93 and then subtracting this
transformed signal from $r_2(t+\Delta)$ in summer 96. The
delayed signal $r_2(t+\Delta)$ is obtained by passing the
received video signal $r_2$ through delay element 97.
The factor $\Delta$ compensates for time delays inherent in
the transformations $T_1$ and $T_2$. (Although these
delays may be different, for purposes of illustration
they are assumed to be the same.) The output of
summer 96 is an estimate of the video signal, $s_{\mathrm{video}}(t+\Delta)$.

In the third step, $s_{\mathrm{audio}}(t+2\Delta)$ is produced
by applying a delayed transform of $T_1$ to the estimate
of $s_{\mathrm{video}}(t+\Delta)$ in element 98 and then subtracting this
transformed signal from $r_1(t+2\Delta)$ in summer 99 to yield
a refined audio estimate $s_{\mathrm{audio}}(t+2\Delta)$. (Further
recursions can be implemented if desired to obtain
more refined estimates of the audio and/or video
signals.)
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The results of step two and three are the 
outputs of the recovery system. The time delay, $\Delta$,
associated with $T_1$ and $T_2$ is less than one
millisecond, a delay which is normally imperceptible
to the users.
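
The three-step recursion can be illustrated numerically with a minimal sketch. Several simplifications are assumed here and are not taken from the specification: the signals are single scalar samples, the delay is ignored, and the cross-talk transforms are modelled as small hypothetical linear gains (0.1 and 0.2). The real transforms include filtering and carrier multiplication.

```python
def recover(r1, r2, t1, t2, s_video0=0.0, refinements=1):
    """Alternately refine audio and video estimates from r1 and r2.

    t1 and t2 are callables standing in for the cross-talk transforms.
    """
    s_video = s_video0
    s_audio = r1 - t1(s_video)          # step one: initial audio estimate
    for _ in range(refinements):
        s_video = r2 - t2(s_audio)      # step two: video estimate
        s_audio = r1 - t1(s_video)      # step three: refined audio estimate
    return s_audio, s_video


def t1(x):
    """Hypothetical cross talk of video into the audio path."""
    return 0.1 * x


def t2(x):
    """Hypothetical cross talk of audio into the video path."""
    return 0.2 * x


true_audio, true_video = 1.0, 2.0
r1 = true_audio + t1(true_video)
r2 = true_video + t2(true_audio)
audio, video = recover(r1, r2, t1, t2)
```

With these gains the initial audio estimate is off by 0.2, while the refined estimate is off by about 0.004, which shows why a few recursions suffice when the cross talk is weak.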

In FIG. 8, an alternative modulation
apparatus is shown including data encoder 22, error
corrector encoder 24, delay modulator 31, audio high
pass filter 31, and mixer 35 in the transmitter
section, and filter elements 36, 37 and delay
demodulator 39 in the receiver section. This
simplified system is based on the observation that the
necessary video data rate for normal use of a video
telephone (e.g., without handwaving or gross head
movements) is about 2,400 bits per second (b/s). The
delay modulator for a 2,400 b/s input stream can
produce an analog signal in a band ranging from 0 to
about 1100 Hz.

By filtering the audio signal to the portion
of the bandwidth above 1100 Hertz, the audio and
video signals can be frequency division multiplexed
(FDM). That is, the video signal lies in the 0 to
1100 Hz band. The audio signal lies in the 1100 to 3000 Hz
(or higher) band. The loss in audio signal-to-noise
would be about 25 percent in dB, which is tolerable
over most telephone channels.
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The video signal in the 0 to 1100 band can 
also be moved to another part of the band by
modulation. Such a relocation of the video signal
may be desirable to reduce its effect on the voice
quality (insofar as much of the energy in normal
voice signals lies below 2000 Hz). For example, by
modulating with a 1000 Hz carrier, the video can be
moved to the 1300 to 2100 band. A carrier recovery
system, similar to that discussed previously, can
then be used to synchronize the transmitter and
receiver for demodulation.

For the case of color transmissions, the
frequency bandwidth can be further divided to provide
a first band for chrominance information, a second
band for luminance information and a third band for
audio information. For example, color information
can be transmitted over a narrow band of nominal
frequency from 0 to 500 Hz. The selection of
particular frequency ranges for such bands is within
the ordinary abilities of those skilled in the art.

Various other modulation techniques can also
be practiced in accordance with the invention. For
example, all of the signals (or a subset, such as
chrominance and luminance information) can be
multiplexed over time, rather than frequency. Thus,
one can time sample the L, $X_{red}$ and $X_{blue}$ signals in
frequency and send the three images on a rotating
basis. Alternatively, one can time sample the color
I and Q signals and transmit them using time-domain
multiplexing over the audio channel.
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The advantage of this scheme is simplicity. 
FIG. 4 shows a block diagram. The disadvantage is
its loss of audio signal-to-noise ratio and its
limitation in tracking motion in the video image.
There may be applications where these disadvantages
are unimportant.

FIGS. 9A-9D illustrate an image averaging
process useful in the image averaging unit 56 of
receiver 38 shown in FIG. 1. In the illustrated
embodiment, 5x6 blocks of pixel values are averaged
with the averaged value being applied to the pixel
situated in the upper left hand corner of the block.
This provides 30 shades of grey. In FIG. 9A, an
initial pixel value is averaged; in FIG. 9B, the
pixel in the next column is averaged using a 5 x 6
matrix of pixel values, which is displaced one column
to the right. In FIG. 9C, a pixel in the next row
relative to the pixel illustrated in FIG. 9A is
shown. This pixel is averaged using a 5 x 6 matrix
of pixel values which is displaced one row downward
relative to the matrix of FIG. 9A. Similarly, in
FIG. 9D, the averaging process is illustrated for a
pixel one row below and one column to the right of
the original pixel shown in FIG. 9A.
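
The sliding-block averaging of FIGS. 9A-9D can be sketched as follows. Two points are assumptions rather than statements from the text: the block is taken as 5 rows by 6 columns (the text does not say which dimension is which), and pixels whose block would extend past the frame edge are simply skipped. The function returns block sums, which are the grey values before any scaling.

```python
def block_grey(frame, rows=5, cols=6):
    """Sum the binary pixels in each rows x cols block.

    The value at [i][j] belongs to the block whose upper left hand
    corner is pixel (i, j). Blocks that would cross the frame edge are
    omitted (an assumption made for this sketch).
    """
    height, width = len(frame), len(frame[0])
    grey = []
    for i in range(height - rows + 1):
        grey.append([
            sum(frame[i + k][j + l] for k in range(rows) for l in range(cols))
            for j in range(width - cols + 1)
        ])
    return grey


# Hypothetical 6 x 8 checkerboard halftone frame.
frame = [[(r + c) % 2 for c in range(8)] for r in range(6)]
grey = block_grey(frame)
```

A checkerboard halftone yields the mid-grey value 15 at every position, while an all-white block yields 30.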
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Various other picture " enhancement" 
techniques can also be employed in the image
averaging unit 56 to reduce the "blockiness" of the
picture. For example, spatial filtering techniques
can be used to average pixel values across a line or
from one line to the next. Moreover, as discussed in
more detail below, in some cases it is also possible
to average pixel values over time (i.e., from one
frame to another) to further enhance image quality.

Additionally, interpolation techniques can
be used to "fill-in" additional data values (e.g.,
intermediate values between pixels or between
lines). With reference again to FIG. 1, the pixel
averaging unit 56 in receiver 38 can further
include means for interpolating pixel values to
improve the resolution of the reconstructed image.
The effect of such interpolation is to smooth out the
discontinuities in the reconstructed image, provide a
subjectively more pleasing image and allow the use of
a larger display at the receiving end. These
interpolation functions can take place entirely at
the receiving end. No modification is required at
the transmitter.

In the illustrated embodiments, the original
scene is described by M samples per line and N lines
per frame, corresponding to MxN picture elements or
pixels per frame. For instance, possible choices for
M and N are M = 90 samples per line and N = 128 lines
per frame, for a total of 90 x 128 = 11,520 pixels
per frame. For each frame, the receiver calculates
the luminance levels at each of the MxN pixels.



WO 91/10324 



PCT/US90/07685 






The pixel averaging unit 56 can further
include a resolution multiplier which introduces
additional interpolation points or pixels in the
reconstructed signal, specifically, between any two
consecutive pixels in a same row or in a same
column. When one interpolation point is added
between any two original pixels, the total number of
pixels per frame is multiplied by 4.

For the purpose of illustration, assume in
the description that only one pixel is added between
any such horizontal or vertical pair, and consider
arbitrary rows $R_i$, $R_{i+1}$ and $R_{i+2}$, and columns $C_j$,
$C_{j+1}$ and $C_{j+2}$ in the reconstructed picture. Let us
call $P_{i,j}$ the pixel at the intersection of row $R_i$ and
column $C_j$. In one embodiment, the resolution
multiplier can proceed as follows:

In step one, interpolated columns are
generated. On row $R_i$, a new pixel, $P_{i,j+1/2}$, is
added halfway between pixels $P_{i,j}$ and $P_{i,j+1}$. Its
luminance, $b_{i,j+1/2}$, is equal to:

$$b_{i,j+1/2} = \begin{cases}
a\,b_{i,j} + (1-a)\,b_{i,j+1}, & \text{if } b_{i,j+2} > 2\,b_{i,j+1} - b_{i,j} \\
\tfrac{1}{2}\,b_{i,j} + \tfrac{1}{2}\,b_{i,j+1}, & \text{if } b_{i,j+2} = 2\,b_{i,j+1} - b_{i,j} \\
(1-a)\,b_{i,j} + a\,b_{i,j+1}, & \text{if } b_{i,j+2} < 2\,b_{i,j+1} - b_{i,j}
\end{cases}$$

where $a$ is a selectable parameter which can range
typically from 0 to about 1/2. One suitable value of
$a$ is 1/4.
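
Step one can be sketched for a single row as follows. The three-way rule compares the pixel two columns ahead with a straight-line extrapolation from the current pair. The treatment of the last pair in the row, where no pixel exists two columns ahead, is not specified in the text; this sketch assumes a plain average there. The function name is hypothetical.

```python
def interpolate_row(row, a=0.25):
    """Insert one interpolated pixel between each horizontal pair."""
    out = []
    for j in range(len(row) - 1):
        left, right = row[j], row[j + 1]
        if j + 2 < len(row):
            predicted = 2 * right - left  # straight-line guess for column j+2
            ahead = row[j + 2]
        else:
            predicted = ahead = 0  # assumption: plain average at the border
        if ahead > predicted:
            middle = a * left + (1 - a) * right
        elif ahead < predicted:
            middle = (1 - a) * left + a * right
        else:
            middle = 0.5 * left + 0.5 * right
        out.extend([left, middle])
    out.append(row[-1])
    return out


ramp = interpolate_row([0, 4, 8])
bend = interpolate_row([0, 4, 12])
```

On a straight ramp the new pixels are plain midpoints; where the luminance bends upward, the new pixel is pulled toward the right-hand neighbor.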
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This procedure can next be repeated for all 
values of j (j = 1, ..., M) within row $R_i$, and for all
values of i (i = 1, ..., N). This results in the
creation of new columns, $C_{j+1/2}$, located between $C_j$
and $C_{j+1}$ over the whole display, thereby doubling the
number of columns.

In step two, a similar process can be
employed to interpolate rows. On column $C_j$, a new
pixel, $P_{i+1/2,j}$, is added halfway between pixels $P_{i,j}$
and $P_{i+1,j}$. Its luminance, $b_{i+1/2,j}$, is equal to:

$$b_{i+1/2,j} = \begin{cases}
a\,b_{i,j} + (1-a)\,b_{i+1,j}, & \text{if } b_{i+2,j} > 2\,b_{i+1,j} - b_{i,j} \\
\tfrac{1}{2}\,b_{i,j} + \tfrac{1}{2}\,b_{i+1,j}, & \text{if } b_{i+2,j} = 2\,b_{i+1,j} - b_{i,j} \\
(1-a)\,b_{i,j} + a\,b_{i+1,j}, & \text{if } b_{i+2,j} < 2\,b_{i+1,j} - b_{i,j}
\end{cases}$$

Again, $a$ is a selectable parameter (e.g., $a$ = 1/4).
When step two is repeated for all possible values of
i and j, it results in an overall doubling of the
number of rows of each frame.

After the enhancement has been completed,
the number of pixel values per frame is 4MN,
consisting of MN original pixel values and 3MN new
interpolated pixel values. This enhancement process
can be repeated any number of times. It will result
each time in a quadrupling of the number of pixel
values within each frame.



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In FIG. 10 , a system 100 is shown that 
provides for adaptive resolution in image
processing. Adaptive resolution provides a means to
enhance the resolution of the received picture
depending upon the degree of motion existing in the
original scene. For example, when there is animated
motion in the scene (e.g., rapid head movement during
a videophone conversation), the basic resolution
techniques described above can be applied. However,
when there is slow motion in the original scene
(i.e., the face is not moving very much), a different
protocol is employed. Finally, when there is no
motion (i.e., either there is no motion at all, or
the amount of motion is very small), yet another
motion detection approach is taken.



As shown in FIG. 10, system 100 includes a
transmitter section 112 having a grey-scale reduction
unit 102, a multi-frame buffer 104, a motion
estimation unit 106, a threshold look-up table 108,
differential image buffer 114, reference frame buffer
116, image data encoders (e.g., run length and Huffman
coding elements) 118, and channel buffer 120, as well
as control circuit 110. The receiver 138 includes
image data decoders 140, differential buffer 142,
reference frame buffer 144, a multi-frame buffer 146,
and a grey-scale computer 150.
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In FIG, 10, the image data is compressed by 
the grey scale reduction unit 102 to yield binary
luminance values. Unit 102 applies threshold
matrices to the incoming data (using the look-up
tables stored in element 108) in a manner analogous
to that described above in FIGS. 1-5. However, in
this embodiment, a multi-frame buffer 104 is used to
store a series of binary frame values. These values
are then compared by motion estimator 106 to
determine the motion state (e.g., fast, intermediate
or slow). Depending on the motion state, different
threshold values are selected from element 108.

Differential buffer 114 contains the changes
between the last received frame in buffer 104 and the
reference frame from buffer 116. The contents of the
reference frame buffer 116 are updated at different
times depending on the motion state, as described in
more detail below. In the illustrated embodiment,
the contents of the reference buffer will be the last
frame when fast motion is occurring, or will be an
average of the four most recent frames for
intermediate motion, or will be an average of sixteen
frames during the slow motion operating mode.

In the system of FIG. 10, the motion
estimator 106 estimates the average amount of motion
existing in the original scene between two instants
in time. Motion estimation is an ongoing process at
the transmitter, and every frame, a new motion
estimate is generated. This estimate is used to
either keep the resolution level unchanged or switch
to a different resolution level.
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For example, if L is a number representing 
the maximum level of motion allowed on the
transmission channel, then a fast motion state can be
defined as existing when the motion estimate is
between the maximum motion level L and L/4. An
intermediate motion state can be defined to exist
when the motion estimate is between L/4 and L/16. A
third state, slow motion, can be defined to exist
when the motion estimate is less than L/16.

In one preferred embodiment, a change in the
motion level in the scene can be signalled by the
transmitter 112 to the receiver 138 by embedding into
the transmitted video bit stream a "resolution sync
word" consisting of two additional bits of
information per frame. In this way, it is possible
for the receiver 138 to decode the resolution sync
word, and know the resolution level to be used in the
reconstruction of images. A different reconstruction
procedure is then used in grey level computer 150 for
each resolution level.

In the illustration of FIG. 10, motion
estimation is based on the differential information,
D(n), D(n-1), D(n-2), D(n-3), which represents the
changes which have occurred over the four most recent
frames. Specifically, the differential information
at frame F(n) is equal to the difference between
binary (i.e., black-and-white) frame F(n) and the
previous binary frame F(n-1):

D(n) = F(n) - F(n-1).

F(n) is a binary matrix of 0's and 1's. Let sum[M]
be the sum of all the elements of the matrix M.
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Then, the motion estimate at the time n, ME(n), can 
be defined as:

ME(n) = sum[D(n)] + sum[D(n-1)] +
        sum[D(n-2)] + sum[D(n-3)]

      = sum[F(n)] - sum[F(n-4)]

and the motion estimate at time n+1 is

ME(n+1) = sum[D(n+1)] + sum[D(n)] +
          sum[D(n-1)] + sum[D(n-2)]

        = sum[F(n+1)] - sum[F(n-3)].

The motion estimate represents the total
number of bit changes that have occurred over the
past four frames. This provides a reading of the
motion level at the end of each frame. The
four-frame averaging process and the readout of the
motion estimate are synchronized to the frame sync.
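
A minimal sketch of a four-frame motion estimate follows. It counts changed pixels with an exclusive-OR, which matches the description of the estimate as a total number of bit changes and is an assumption of this sketch; the signed-difference formula as printed telescopes to sum[F(n)] - sum[F(n-4)], which is a different quantity. The function names and sample frames are hypothetical.

```python
def count_changes(frame_a, frame_b):
    """Number of pixel positions that differ between two binary frames."""
    return sum(
        a ^ b
        for row_a, row_b in zip(frame_a, frame_b)
        for a, b in zip(row_a, row_b)
    )


def motion_estimate(frames):
    """Total bit changes across consecutive frames, oldest frame first.

    Passing the five most recent frames gives the four differential
    frames D(n-3) through D(n).
    """
    return sum(
        count_changes(frames[k], frames[k + 1])
        for k in range(len(frames) - 1)
    )


# Hypothetical 1 x 2 frames F(n-4) through F(n).
history = [[[0, 0]], [[1, 0]], [[1, 1]], [[1, 1]], [[0, 1]]]
me = motion_estimate(history)
```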

- If ME(n) is between L and L/4, a coarse
resolution level is used (e.g., same as described
above in connection with FIGS. 3 and 4).

- If ME(n) is between L/4 and L/16, an
intermediate resolution level is used.

- If ME(n) is less than L/16, a fine
resolution level is used.

It should be clear that other threshold
choices can be made in distinguishing motion states.
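
The three-way selection above can be sketched as a small function. The handling of estimates that fall exactly on L/4 or L/16 is not specified in the text; this sketch assigns a boundary value to the coarser of the two levels, which is an assumption. The function name and level labels are hypothetical.

```python
def resolution_level(me, max_motion):
    """Pick a resolution level from the motion estimate and the limit L."""
    if me >= max_motion / 4:
        return "coarse"        # fast motion
    if me >= max_motion / 16:
        return "intermediate"  # intermediate motion
    return "fine"              # slow motion


levels = [resolution_level(me, 1600) for me in (500, 200, 50)]
```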
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In the embodiment of FIGS 3 and 4, there was 
only one way to calculate the grey levels. This was
done by averaging over 4x4 blocks the bit values
within a binary frame F(n), i.e., the values at
positions i, i+1, i+2, i+3 on lines j, j+1, j+2, and
j+3.

In other words, for frame F(n), the grey
level, $G_{i,j}(n)$, of pixel (i,j) was defined as:

$$G_{i,j}(n) = \sum_{k=0}^{3} \sum_{l=0}^{3} F_{i+k,j+l}(n)$$

However, in the embodiment of FIG. 10, the
grey level can be calculated as:

$$G_{i,j}(n) = \sum_{m=0}^{q} \sum_{k=0}^{p} \sum_{l=0}^{p} F_{i+k,j+l}(n-m)$$

The notation (pxp,q) is used to represent
this class of grey level estimates. The notation
underlines the fact that the spatial sum of the
binary values is over a block of size pxp and the time
sum is over q frames. With this notation, a three grey
level estimation scheme is illustrated in FIGS. 11A,
11B and 11C.

Specifically, FIG. 11A illustrates the
coarse resolution level in which spatial averaging
over a 4x4 pixel block from a single frame is used to
derive a binary value in the grey-scale reduction
unit. FIG. 11B illustrates the intermediate
resolution level, where the grey level of a pixel is
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derived from both spatial averaging over a 2x2 block 
and time averaging over 4 successive frames. FIG.
11C illustrates the fine resolution level, where the
averaging is over a 1x1 block, or 1 pixel, and over
16 successive frames. There is no spatial averaging,
but only time averaging.

As the amount of motion decreases, the
spatial averaging is decreased and more time
averaging is introduced. The grey level resolution
(e.g., thresholds) can be left unchanged at 16 levels
or 4 bits of grey.
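
The (pxp,q) class of estimates can be sketched as a single function. This sketch assumes the sums run over exactly p x p pixels and exactly q frames, in line with the description of FIGS. 11A to 11C, and it assumes that i indexes the position along a line and j the line. The function name and test frames are hypothetical.

```python
def grey_level(frames, i, j, p, q):
    """Sum a p x p block at position i, line j over the q latest frames.

    frames is a list of binary frames, oldest first; each frame is a
    list of lines, and each line is a list of 0/1 pixel values.
    """
    recent = frames[-q:]
    return sum(
        frame[j + l][i + k]
        for frame in recent
        for k in range(p)
        for l in range(p)
    )


# Hypothetical history of sixteen all-white 4 x 4 frames.
white = [[1] * 4 for _ in range(4)]
history = [white] * 16
coarse = grey_level(history, 0, 0, p=4, q=1)
intermediate = grey_level(history, 0, 0, p=2, q=4)
fine = grey_level(history, 0, 0, p=1, q=16)
```

Each of the three modes sums sixteen binary samples, so the range of grey values stays the same while spatial detail is traded against temporal averaging.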

When n (less than or equal to 16) successive
frames are averaged, n different threshold matrices
are used. In total, the procedure uses one 4x4
threshold matrix, M, at the coarse resolution level,
four 2x2 threshold matrices, M1 to M4, at the
intermediate resolution, and 16 threshold levels at
the fine resolution. The threshold matrices M, M1,
M2, M3, M4 are given in FIGS. 11A and 11B. The 16
scalar thresholds are the values 16, 32, 48, ... up
to 256, i.e., multiples of 16, as illustrated
schematically in FIG. 11C. These matrices are for
illustration purposes and other matrices can perform
equally well.

With reference again to FIG. 10, a
multi-frame buffer 104 is used in the transmitter 112
to calculate the motion level in the scene. A
four-frame buffer is sufficient to calculate the
estimate, once every frame. Reference buffer 116 is
used to calculate the motion estimate and generate
the differential information.
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While in the coarse resolution mode or when 
switching to the coarse resolution mode, the
previously-received frame can be used as the
reference frame for calculating the differential
information. The same convention is used at the
receiver. The decision to switch to a different
resolution level (such as intermediate or fine) can
occur at the end of any frame.

Assuming that the decision to switch to
intermediate resolution occurs immediately at the end
of frame F(n), then the differential information
D(n+1), D(n+2), D(n+3), D(n+4) is calculated using
F(n) as the reference frame. This convention is also
followed by the receiver. In other words,

D(n+1) = F(n+1) - F(n)
D(n+2) = F(n+2) - F(n)
D(n+3) = F(n+3) - F(n)
D(n+4) = F(n+4) - F(n)







In addition, the transmitter does not switch
to another resolution (coarse or fine) until all four
differential frames have been transmitted. At that
point, whether a resolution switch occurs or not, the
last transmitted frame becomes the new reference
frame.
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If one assumes instead that a decision to 
switch to fine resolution occurred at the end of
frame F(n), then this frame is used as the reference
for the next 16 frames:

D(n+1) = F(n+1) - F(n)
D(n+2) = F(n+2) - F(n)
...
D(n+16) = F(n+16) - F(n).

Again, the resolution does not switch during
this period of time until the new pictures have been
formed.
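
The reference-frame convention described above can be sketched as follows: one frame is held as the reference, and each of the following frames is differenced against it. Frames are shown as flat lists for brevity, and the function names and mode labels are hypothetical.

```python
def differentials(reference, following):
    """Difference each following frame against one fixed reference frame."""
    return [
        [current - ref for current, ref in zip(frame, reference)]
        for frame in following
    ]


def frames_per_reference(mode):
    """How many frames share one reference frame in each resolution mode."""
    return {"coarse": 1, "intermediate": 4, "fine": 16}[mode]


# Hypothetical reference F(n) and two following frames.
reference = [0, 1, 1]
d = differentials(reference, [[1, 1, 0], [0, 1, 1]])
```

Holding the reference fixed keeps transmitter and receiver in step for the whole averaging window, since both sides rebuild every frame of the window from the same starting image.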



With reference again to FIG. 10, it should
be noted that controller 110 can be used to
desensitize the updating mechanism of the
differential buffer 114 based upon conditions in the
channel buffer 120. When the channel buffer 120
exceeds a limit (defined by the transmission
bandwidth), controller 110 can increase hysteresis by
incrementing the dither parameter S, thereby making
it more difficult to toggle a particular pixel and,
hence, reducing the number of pixel changes recorded
in the differential buffer 114. This same mechanism
also provides flicker control.
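
The feedback from channel buffer to hysteresis can be illustrated with a hedged sketch. The exact form of the hysteresis adjustment is defined elsewhere in the specification; here it is assumed to be a symmetric offset S that moves the threshold away from the pixel's previous state. The function names, the step size and the numeric values are all hypothetical.

```python
def threshold_with_hysteresis(value, threshold, previous_bit, s):
    """Binarize a pixel, biasing the threshold toward its previous state.

    Assumption: a previously white pixel sees a lowered threshold and a
    previously black pixel sees a raised one, each by the amount s.
    """
    adjusted = threshold - s if previous_bit == 1 else threshold + s
    return 1 if value > adjusted else 0


def adjust_hysteresis(s, buffer_fill, limit, step=1):
    """Increase s while the channel buffer is above its limit."""
    return s + step if buffer_fill > limit else s


without = threshold_with_hysteresis(130, 128, previous_bit=0, s=0)
with_s = threshold_with_hysteresis(130, 128, previous_bit=0, s=4)
s_next = adjust_hysteresis(2, buffer_fill=900, limit=800)
```

A pixel just above the threshold toggles when S is 0 but holds its previous value when S is 4, so raising S reduces the number of changes the encoder has to send.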
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At the receiver 138, a multi-frame buffer 
146 can be used to store data values over a series of
frames so that the grey-level computer 150 can
calculate the grey levels, e.g., by spatial averaging
in the fast motion mode, by space and time averaging
over four frames in the intermediate motion mode, or
by time averaging over 16 frames in the slow motion
mode.

It should be appreciated that various
alternative averaging techniques can be substituted
for this method, including, for example, approaches
in which the pixel to be averaged is centered in the
matrix, as well as methods in which weighted values
are applied to various pixel values within the matrix.



What is claimed is: 
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Claims 

1. In a signal processing apparatus for
image data compression, the combination comprising:
    storage means for storing a reduced
grey-scale image derived from image values;
    comparison means for comparing a
reduced grey-scale image of a current image frame
with a reference image from said storage means, and
for generating a luminance difference signal
representative of the pixel positions at which the
grey-scale value has changed between a previous image
frame and a current image frame; and
    encoding means for encoding said
difference signal.



2. The system of claim 1 wherein the
system further comprises a grey-scale reduction means
for reducing the number of grey levels available to
represent each pixel of an image frame.

3. The system of claim 2 in which the
grey-scale reduction means further comprises a
dithered threshold means for converting a multiple
grey-scale image into a halftone image.

4. The system of claim 3 in which the
dithered threshold means further comprises a
hysteresis adjustment in which the threshold value
applied to each pixel is modified in order to reduce
toggling of the pixel value.
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5. The system of claim 4 in which the 
system further comprises means for varying the 
hysteresis adjustment. 

6. The system of claim 1 in which the
encoding means further comprises a run-length
encoding means for representing series of repeating
differential image data bits in a coded fashion, such
that long series of said bits are represented by
fewer bits.

7. The system of claim 1 wherein the
system further comprises a modulation means for
modulating a carrier signal with the encoded
luminance difference signal for transmission.

8. The system of claim 7 wherein the
modulation means further comprises means for
multiplexing an audio signal with said luminance
difference signal.

9. The system of claim 7 wherein the
modulation means further comprises means for
multiplexing a chrominance signal with said luminance
difference signal.

10. The system of claim 1 wherein the
system further includes an adaptive resolution means
for determining the degree of motion in successive
image frames and for modifying the resolution in
response to such determination.
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11. In a signal processing apparatus for 
decoding and reconstructing an image from compressed 
image data, the combination comprising: 

    a differential image decoding unit for
decoding a difference signal representative of
changes in a reduced grey-scale image; and
    an image updating unit for updating
changes to a previously stored image by adding said
decoded difference signal to said previously stored
image.

12. The system of claim 11
which further comprises an image averaging unit for
averaging blocks of pixel values to increase the
number of grey levels of said updated image.

13. The system of claim 11
which further comprises an image interpolating unit
for generating a more detailed image by interpolation.



14. A method of signal processing for image
data compression, the method comprising:
    storing a reduced grey-scale image
derived from image values;
    comparing a reduced grey-scale image of
a current image frame with a previously stored
reference grey-scale image;
    generating a luminance difference
signal representative of the pixel positions at which
the grey-scale has changed between a previous image
frame and a current image frame; and
    encoding said difference signal.
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15. The method of claim 14 wherein the 
method further comprises reducing the number of grey
levels available to represent each pixel of an image
frame prior to storage and comparison.

16. The method of claim 15 in which the
step of reducing the number of grey levels further
comprises converting a multiple grey-scale image into
a halftone image.

17. The method of claim 15 in which the
step of reducing the number of grey levels further
comprises comparing a dithered threshold value to the
luminance value of each pixel, and assigning the
pixel a reduced grey-scale value based upon the
comparison.

18. The method of claim 17 in which the
step of comparing a dithered threshold value further
comprises applying a hysteresis adjustment to said
threshold value in order to reduce toggling of the
pixel value.

19. The method of claim 18 in which the
method further comprises varying the hysteresis
adjustment to desensitize the comparison step.

20. The method of claim 14 in which the
step of encoding the luminance difference signal
further comprises run-length encoding said signal
such that commonly repeated series of image data bits
are assigned shorter code words.
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21. The method of claim 14 wherein the 
method further comprises modulating a carrier signal
with the encoded luminance difference signal for
transmission.

22. The method of claim 14 wherein the
modulation step further comprises multiplexing an
audio signal with said luminance difference signal.

23. The method of claim 14 wherein the
method further comprises measuring the degree of
change in the encoded luminance difference signal and
performing different comparisons based upon the
degree of change.
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